🤖 Transformer Architecture
Self-Attention, BERT, GPT, Multi-Head Attention
Scoured 122,540 posts in 1.13s
How Transformer Architecture Powers LLMs
dev.to · 4h · Discuss: DEV · 🔄 Sequence-to-Sequence Models
Interpretable Vision Transformers in Image Classification via SVDA
arxiv.org · 12h · 🗄️ Vector Databases
Wavelet Meets Adam: Compressing Gradients for Memory-Efficient Training
chipublib.idm.oclc.org · 1d · 🧠 Neural Network Architectures
Looping Back to Move Forward: Recursive Transformers for Efficient and Flexible Large Multimodal Models
arxiv.org · 1d · 🔄 Sequence-to-Sequence Models
An uncertainty-aware transformer framework for wind power forecasting with multiscale attention and adaptive feature fusion
chipublib.idm.oclc.org · 1d · 📈 Time Series Forecasting
Cuentos: A Large-Scale Eye-Tracking Reading Corpus on Spanish Narrative Texts
nature.com · 11h · 🔄 Sequence-to-Sequence Models
How Andrej Karpathy Built a Working Transformer in 243 Lines of Code
analyticsvidhya.com · 4h · 🚀 Model Deployment
Beyond Kuramoto Models: Associative Memory and Plastic Synapses in ML Ensembles
hackernoon.com · 1d · 🧠 Neural Network Architectures
The 4 Mixture of Experts Architectures: How to Train 100B Models at 10B Cost
pub.towardsai.net · 4h · 🧠 Deep Learning
Carnegie Mellon at NeurIPS 2025
blog.ml.cmu.edu · 1d · 🧠 Deep Learning
Multi-TPC: A Multimodal Dataset for Three-Party Conversations with Speech, Motion, and Gaze
nature.com · 15h · 🔄 Sequence-to-Sequence Models
YORU: Animal behavior detection with object-based approach for real-time closed-loop feedback
science.org · 1d · 🧠 Deep Learning
A History of Large Language Models
gregorygundersen.com · 17h · 🔄 Sequence-to-Sequence Models
The 4 Flash Attention Variants: How to Train Transformers 10× Longer Without Running Out of Memory
pub.towardsai.net · 4d · 👁️ Attention Mechanisms
A C implementation of the inference pipeline for Mistral AI’s Voxtral Realtime 4B model
blog.adafruit.com · 1h · 🧠 Neural Network Architectures
Gibbs Measures from Deep Shaped Multilayer Perceptrons
link.aps.org · 4h · 🧠 Deep Learning
Training-Free Real-Time Control for Autoregressive Video Generation
daydream.live · 3h · Discuss: Hacker News · 🎲 Synthetic Data Generation
Digitizing the "Shokunin": How we encoded a Master's hammer strike into AI
yusukekaizen.substack.com · 11h · Discuss: Substack · 🤖 AI
Transformer-Based Memory Forecasting: Leveraging Anonymized Aggregates for Personal Insights
novice.media · 20h · Discuss: Hacker News · 🔄 LSTM Networks
An assistive robot learns to set and clear the table by observing humans
techxplore.com · 19h · 🤖 AI